In task-oriented dialogs such as MultiWoZ (Budzianowski et al., 2018), an informative and/or successful system response needs to include necessary key information such as the phone number of a hotel. We therefore hypothesize that helping the model focus on learning key quantities in the dialog lets it generate more informative and helpful responses. In this paper, we propose a new training algorithm, Reinforced Language Modeling (RLM), which uses a fine-grained reward function and reinforcement learning to help the model generate key quantities correctly at test time. Empirical results show that the proposed RLM achieves state-of-the-art performance on the inform rate, success rate, and combined score in MultiWoZ.
Translated by Google Translate
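Semantic communication in the 6G era is regarded as a promising communication paradigm that can break through the bottleneck of traditional communication. However, its application to multi-user scenarios, especially the broadcast case, remains unexplored. To exploit the benefits enabled by semantic communication effectively, in this paper we propose a one-to-many semantic communication system. Specifically, we propose a deep neural network (DNN) enabled semantic communication system called MR_DeepSC. By leveraging the semantic features of different users, a semantic recognizer based on a pre-trained model, i.e., DistilBERT, is built to distinguish between users. In addition, transfer learning is adopted to speed up the training of new receiver networks. Simulation results show that the proposed MR_DeepSC outperforms the other benchmarks under different channel conditions, especially in the low signal-to-noise ratio (SNR) regime.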
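Transformer encoder models have shown impressive performance in dialogue modeling. However, because Transformers are inefficient at processing long sequences, the dialogue history often has to be truncated. To address this problem, we propose a new memory-augmented Transformer that is compatible with existing pre-trained encoder models and preserves history information efficiently. It combines a separate memory module with the pre-trained Transformer to exchange information effectively between the memory states and the current input context. We evaluate our model on three dialogue datasets and two language modeling datasets. Experimental results show that our method achieves higher efficiency and performance than other pre-trained Transformer baselines.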
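Despite recent success, deep learning-based methods for predicting 3D garment deformation under body motion suffer from interpenetration between the garment and the body. To address this problem, we propose a novel collision-handling neural network layer called the Repulsive Force Unit (ReFU). Based on the signed distance function (SDF) of the underlying body and the current garment vertex positions, ReFU predicts per-vertex offsets that push any interpenetrating vertex to a collision-free configuration while preserving fine geometric details. We show that ReFU is differentiable, has trainable parameters, and can be integrated into different network backbones that predict 3D garment deformation. Our experiments show that, compared with prior methods based on collision losses or post-processing optimization, ReFU significantly reduces the number of collisions between the body and the garment and better preserves geometric details.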
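The idea of pushing interpenetrating vertices out along the SDF can be illustrated with a minimal sketch. This is not the ReFU layer itself: the body is assumed to be a sphere with an analytic SDF, the offset rule is fixed rather than learned, and the names `sphere_sdf`, `sphere_sdf_grad`, `repel`, `margin`, and `scale` are hypothetical.

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(points - center, axis=1) - radius

def sphere_sdf_grad(points, center):
    """Unit outward direction of the sphere SDF at each point."""
    diff = points - center
    norms = np.linalg.norm(diff, axis=1, keepdims=True)
    return diff / np.maximum(norms, 1e-12)

def repel(vertices, center, radius, margin=0.0, scale=1.0):
    """Move vertices whose SDF is below `margin` outward along the gradient.

    Assumption: a learned layer would predict `scale` per vertex; here it is
    a fixed constant purely for illustration.
    """
    d = sphere_sdf(vertices, center, radius)
    n = sphere_sdf_grad(vertices, center)
    depth = np.maximum(margin - d, 0.0)  # zero for collision-free vertices
    return vertices + scale * depth[:, None] * n

center = np.zeros(3)
garment = np.array([[0.5, 0.0, 0.0],   # inside the unit sphere
                    [2.0, 0.0, 0.0]])  # already outside
fixed = repel(garment, center, radius=1.0)
```

Vertices that are already collision-free receive a zero offset, so their geometry is left unchanged, which mirrors the detail-preserving goal stated in the abstract.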
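Collecting data for training dialogue systems can be extremely expensive because it involves human participants and requires extensive annotation. In document-grounded dialogue systems in particular, human experts need to read unstructured documents carefully to answer users' questions. As a result, existing document-grounded dialogue datasets are relatively small and hinder the effective training of dialogue systems. In this paper, we propose an automatic data augmentation technique grounded on documents through a generative dialogue model. The dialogue model consists of a user bot and an agent bot that can synthesize diverse dialogues given an input document, which are then used to train a downstream model. When supplementing the original dataset, our method achieves significant improvements over traditional data augmentation methods. We also achieve good performance in the low-resource setting.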
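With the development of deep learning (DL), natural language processing (NLP) makes it possible to analyze and understand large amounts of language text. Accordingly, with the help of NLP, we can achieve semantic communication through joint semantic source and channel coding over a noisy channel. However, existing methods for this goal use a fixed Transformer from NLP and ignore the differences in the semantic information contained in each sentence. To solve this problem, we propose a new semantic communication system based on the Universal Transformer. Compared with the traditional Transformer, the Universal Transformer introduces an adaptive recurrence mechanism. With this mechanism, the new semantic communication system can transmit sentences carrying different semantic information more flexibly and achieves better end-to-end performance under various channel conditions.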
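In this paper, we apply a deep reinforcement learning approach to optimize investment decisions in portfolio management. We make several innovations, such as adding a short-selling mechanism and designing an arbitrage mechanism, and apply our model to optimize decisions for several randomly selected portfolios. The experimental results show that our model is able to optimize investment decisions and can obtain excess returns in the stock market; the optimized agent keeps the asset weights at fixed values throughout the trading periods and trades at a very low transaction cost rate. In addition, we redesign the formula for calculating portfolio asset weights during continuous trading so that it supports leveraged trading, which fills a theoretical gap in the calculation of portfolio weights when shorting.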
Current mainstream object detection methods for large aerial images usually divide large images into patches and then exhaustively detect the objects of interest on all patches, whether or not they contain objects. This paradigm, although effective, is inefficient because the detectors have to go through all patches, which severely hinders inference speed. This paper presents an Objectness Activation Network (OAN) that helps detectors focus on fewer patches while achieving more efficient inference and more accurate results, enabling a simple and effective solution to object detection in large images. In brief, OAN is a light fully-convolutional network that judges whether each patch contains objects, and it can be easily integrated into many object detectors and jointly trained with them end-to-end. We extensively evaluate our OAN with five advanced detectors. Using OAN, all five detectors obtain more than 30.0% speed-up on three large-scale aerial image datasets, together with consistent accuracy improvements. On extremely large Gaofen-2 images (29200$\times$27620 pixels), our OAN improves the detection speed by 70.5%. Moreover, we extend our OAN to driving-scene object detection and 4K video object detection, boosting the detection speed by 112.1% and 75.0%, respectively, without sacrificing accuracy. Code is available at https://github.com/Ranchosky/OAN.
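The patch-filtering idea can be sketched in a few lines: score each patch for objectness and pass only the promising ones to the detector. This is an illustrative sketch, not the released OAN code. The scoring function `toy_objectness` is a hypothetical stand-in for the learned network, and `split_into_patches`, `select_patches`, and `threshold` are assumed names.

```python
import numpy as np

def split_into_patches(image, patch):
    """Yield ((row, col) origin, pixels) for non-overlapping square patches."""
    h, w = image.shape[:2]
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            yield (y, x), image[y:y + patch, x:x + patch]

def toy_objectness(pixels):
    """Hypothetical stand-in for a learned objectness score:
    the fraction of non-zero pixels in the patch."""
    return float(np.count_nonzero(pixels)) / pixels.size

def select_patches(image, patch, threshold=0.0, score_fn=toy_objectness):
    """Return origins of patches whose objectness exceeds the threshold.
    Only these patches would be handed to the (expensive) detector."""
    return [origin
            for origin, pixels in split_into_patches(image, patch)
            if score_fn(pixels) > threshold]

image = np.zeros((8, 8))
image[5, 6] = 1.0                     # a single "object" pixel
kept = select_patches(image, patch=4)
```

In this toy image only one of the four patches is kept, so a detector would run on a quarter of the input; the actual savings depend on how sparse the objects are.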
A fundamental question in any peer-to-peer ride-sharing system is how to meet passenger requests both effectively and efficiently so as to balance supply and demand in real time. On the passenger side, traditional approaches focus on pricing strategies that raise the probability of users' calls in order to adjust the distribution of demand. However, previous methods do not account for how a change in strategy affects future supply and demand: drivers are repositioned to different destinations by passengers' calls, which affects their income for some time afterwards. Motivated by this observation, we attempt to optimize the distribution of demand by learning long-term spatio-temporal values as a guideline for the pricing strategy. In this study, we propose an offline deep reinforcement learning based method that focuses on the demand side to improve the utilization of transportation resources and customer satisfaction. We adopt a spatio-temporal learning method to learn the value of different times and locations, then incentivize passengers' ride requests to adjust the distribution of demand and balance supply and demand in the system. In particular, we model the problem as a Markov Decision Process (MDP).
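Why do networks have negative weights at all? The answer is: to learn more functions. We mathematically prove that deep neural networks with all non-negative weights are not universal approximators. Much of the deep learning literature assumes this fundamental result without having previously proved it or demonstrated its necessity.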
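One standard intuition for this kind of result, which is not necessarily the argument used in the paper, is that a network with non-negative weights and a non-decreasing activation such as ReLU is itself non-decreasing in every input, so it cannot fit a decreasing target such as f(x) = -x. The sketch below checks this numerically; the layer sizes, the random seed, and the choice to leave biases unconstrained are assumptions made for illustration.

```python
import numpy as np

def nonneg_mlp(x, weights, biases):
    """Forward pass of an MLP with ReLU on the hidden layers.
    All weight matrices are assumed to be element-wise non-negative."""
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = h @ W + b
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)  # ReLU is non-decreasing
    return h

rng = np.random.default_rng(0)
sizes = [1, 8, 8, 1]  # hypothetical architecture
weights = [np.abs(rng.normal(size=(m, n)))
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=n) for n in sizes[1:]]  # biases left unconstrained

xs = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)  # increasing inputs
ys = nonneg_mlp(xs, weights, biases).ravel()

# The output never decreases as the input increases.
is_monotone = bool(np.all(np.diff(ys) >= -1e-9))
```

Changing the seed or the layer sizes should leave `is_monotone` true, because each layer composes non-decreasing maps.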